WEBVTT 00:00:07.000 --> 00:00:15.000 Artificial intelligence is a rapidly growing technology with the potential to revolutionize cultural heritage preservation and dissemination. 00:00:15.000 --> 00:00:17.000 While AI offers immense benefits, it also raises important ethical considerations. 00:00:17.000 --> 00:00:28.000 In today's conversation, we'll discuss how public interest values can shape the development and deployment of AI in cultural heritage, including how to ensure that AI reflects diverse perspectives and promotes cultural understanding. 00:00:28.000 --> 00:00:40.000 Hi, everyone! I'm Chris Freeland, and I'm a librarian at the Internet Archive. Full disclosure: 00:00:40.000 --> 00:00:41.000 I did not write that intro; the popular AI system ChatGPT did, and it did a really good job. 00:00:41.000 --> 00:00:47.000 It beat me. It did better than I did when I tried to write it. 00:00:47.000 --> 00:01:06.000 We also asked OpenAI's DALL-E to generate an image for use in today's session, which you can see here on the left. It's wild and wacky and beautiful and weird, in all kinds of interesting ways. We asked it to generate an image that conveyed the opportunities, challenges, 00:01:06.000 --> 00:01:12.000 and ethical considerations of using AI in a cultural heritage context, 00:01:12.000 --> 00:01:15.000 and this is what it came up with, which seems really perfect to me. 00:01:15.000 --> 00:01:20.000 So some of this was, you know, fun workflow improvements to our webinar. 00:01:20.000 --> 00:01:28.000 But what about the larger issues around using AI for cultural heritage preservation and dissemination? 00:01:28.000 --> 00:01:29.000 Well, that's what our panelists are going to discuss. 00:01:29.000 --> 00:01:35.000 Today we're going to be joined by representatives from civil society organizations, including 00:01:35.000 --> 00:01:36.000 Lila Bailey 00:01:36.000 --> 00:01:37.000 of the Internet... 00:01:37.000 --> 00:02:07.000 Okay. 00:02:16.000 --> 00:02:17.000 You can turn those on using the live transcript feature of Zoom. 00:02:17.000 --> 00:02:31.000 And finally, we are recording today's session, and all registrants will receive an email tomorrow with the recording and all of the links. 00:02:31.000 --> 00:02:39.000 So, to start things off and to give a little background around AI and cultural heritage, 00:02:39.000 --> 00:02:41.000 and just what's happening in the broader world, is Brewster Kahle, the founder and digital librarian of the Internet Archive. 00:02:41.000 --> 00:02:42.000 Welcome to the screen, Brewster. 00:02:42.000 --> 00:02:45.000 Thank you very much, Chris. These are exciting times. 00:02:45.000 --> 00:02:46.000 I guess you can't have a conversation without mentioning AI 00:02:46.000 --> 00:02:49.000 somewhere in it. So this is a conversation all about AI and cultural heritage institutions. 00:02:49.000 --> 00:02:57.000 So I thought I'd just drop in some of the excitement around the Internet Archive. 00:02:57.000 --> 00:02:59.000 We're using it a lot already, and we're trying it out in many different ways, just to go and do our day-to-day tasks. 00:02:59.000 --> 00:03:08.000 So not the bigger-picture things, which we're going to talk about later. 00:03:08.000 --> 00:03:10.000 But I just thought I'd drop in a couple of things 00:03:10.000 --> 00:03:14.000 we've tried. One is taking our 78rpm
00:03:14.000 --> 00:03:19.000 records from 100 years ago — the Edison recordings donated by UC 00:03:19.000 --> 00:03:26.000 Santa Barbara — and running them through Whisper to go and try to extract the words, and it does remarkably well. 00:03:26.000 --> 00:03:33.000 So it's not like full lyrics, but it does give you an idea of what the song is about, and it gets a lot of it right. 00:03:33.000 --> 00:03:43.000 So I think that helps in the navigation and use of these 100-year-old materials, and maybe analysis across the whole data set. 00:03:43.000 --> 00:03:54.000 We've taught the open-source Tesseract OCR program to handle new fonts, or fonts that it was really not doing that well with, to be able to handle 00:03:54.000 --> 00:04:00.000 digitization of microfiche a lot better. So it's just taking machine learning and getting it to be better. 00:04:00.000 --> 00:04:07.000 And then the generative AI: we've been recording Russian, Belarusian, and Iranian television, 00:04:07.000 --> 00:04:08.000 but figuring out what's on it has been very difficult. 00:04:08.000 --> 00:04:20.000 We've used the still images and found things like Fox News hosts that have appeared commonly, and those findings have been run and been useful in the Washington Post. 00:04:20.000 --> 00:04:21.000 But can we find out what they're actually saying? 00:04:21.000 --> 00:04:29.000 And the answer is now yes: there's speech-to-text that we can do to be able to get, say, a Russian program's speech into Russian text. 00:04:29.000 --> 00:04:46.000 Then we use automatic translation to go from Russian to English — imperfectly, but still — and then run summarization over it, to be able to say, over the course of the day: 00:04:46.000 --> 00:04:53.000 what are the top 10 points that are being drilled by state television in Russia, Iran, or Belarus? 00:04:53.000 --> 00:05:03.000 And it's doing remarkably well. Another use: we're using it to parse citations from scanned journal literature. 00:05:03.000 --> 00:05:04.000 People are supposed to go and do their citations in exactly the right form. 00:05:04.000 --> 00:05:11.000 Well, they don't, and the OCR gets it wrong. 00:05:11.000 --> 00:05:17.000 And two last ones. Oh, we put out a challenge. 00:05:17.000 --> 00:05:23.000 We have thousands of hand-restored 78s — basically, people have de-noised them. 00:05:23.000 --> 00:05:33.000 So we have the original and the human-restored version, and we would love it if anybody steps forward to take on an AI challenge: to take 00:05:33.000 --> 00:05:44.000 those thousands of examples — can we apply it to 400,000 other recordings that we would like to have cleaned up in much the same way that people did by hand? 00:05:44.000 --> 00:05:49.000 And lastly, we've been participating with the Copyright Office, and on our blog 00:05:49.000 --> 00:05:58.000 there is a submission that we did to try to say how we see the opportunities and the risks to cultural institutions of having our materials used and mined, and how we can use these tools. 00:05:58.000 --> 00:06:17.000 So it's an exciting time. I'm sure all of you have experience as well, and we would love to learn how you're using these new tools to further your collections. So thank you, 00:06:17.000 --> 00:06:21.000 Chris, and I'm really looking forward to this. 00:06:21.000 --> 00:06:25.000 Thanks for that, Brewster. Yeah, this is gonna be a great discussion.
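[Editor's note: a minimal sketch of the transcribe-translate-summarize pipeline Brewster describes, assuming the open-source openai-whisper package and Hugging Face transformers. The model names and file name are illustrative assumptions, not the Internet Archive's actual production stack.]

```python
# Rough sketch: speech-to-text -> machine translation -> summarization,
# mirroring the Russian-television workflow described above.
import whisper                     # pip install openai-whisper
from transformers import pipeline  # pip install transformers

# 1. Speech-to-text: get the Russian speech into Russian text.
stt = whisper.load_model("medium")
russian_text = stt.transcribe("broadcast.mp3", language="ru")["text"]

# 2. Automatic translation: Russian -> English (imperfectly, but still).
translate = pipeline("translation", model="Helsinki-NLP/opus-mt-ru-en")
english_text = translate(russian_text, max_length=512)[0]["translation_text"]

# 3. Summarization: boil a broadcast down to its top talking points.
#    (A real day's worth of programming would need to be chunked to fit
#    the models' context windows.)
summarize = pipeline("summarization", model="facebook/bart-large-cnn")
summary = summarize(english_text, max_length=120, min_length=30)[0]["summary_text"]

print(summary)
```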
00:06:25.000 --> 00:06:28.000 So what I'd love to do now is welcome 00:06:28.000 --> 00:06:29.000 our panelists to the screen, and I would love to hand the conversation over to Luis Villa to take it from here. 00:06:29.000 --> 00:06:37.000 Over to you, Luis. 00:06:37.000 --> 00:06:45.000 Hi, everybody! I am so excited to be here today. You know, as I think was already mentioned, 00:06:45.000 --> 00:06:47.000 if you're in tech, it is hard to have a conversation with anybody — with your parents, 00:06:47.000 --> 00:06:55.000 with your siblings, needless to say with your co-workers — without ML 00:06:55.000 --> 00:07:16.000 coming up pretty quickly. But I think that these conversations often tend to focus very much on "oh, this cool technology that I saw," or — I think, to our perhaps maturing credit as a society — we are often talking these days about the artists whose work is being supplemented, or perhaps replaced, by this 00:07:16.000 --> 00:07:23.000 stuff. I think one gap that hopefully we're going to fill today is talking about these commons — 00:07:23.000 --> 00:07:26.000 the deliberately shared commons that these machine learning tools draw upon. 00:07:26.000 --> 00:07:31.000 So I'm really excited to have our panel here today: 00:07:31.000 --> 00:07:42.000 Lila, the general counsel of the Internet Archive; Kat, the general counsel of Creative Commons; and Jacob, the associate general counsel of the Wikimedia Foundation. 00:07:42.000 --> 00:07:54.000 I think pretty much all these organizations need no introduction for anybody here, because it's a great set of organizations responsible for so much of our digital commons. Before we start, 00:07:54.000 --> 00:08:08.000 since I know there's sort of a breadth of experience in our audience today, I want to call out one particular thing that's gonna come up a fair bit in many of our conversations today. 00:08:08.000 --> 00:08:17.000 It's important, I think, to understand that — although everybody here is a copyright lawyer, we're not gonna get too deep into copyright law today, 00:08:17.000 --> 00:08:19.000 I don't think — both from a copyright law perspective and from an ethical perspective, 00:08:19.000 --> 00:08:33.000 there are different rules and different impacts when we're talking about the materials that are input into machine learning training and the materials that are taken out of machine learning 00:08:33.000 --> 00:08:48.000 once we create new things. And so we're gonna try, we hope, as a panel, to keep those topics separate, or at least flag when they're importantly separate. 00:08:48.000 --> 00:08:49.000 But if you do have questions, please feel free to ask them 00:08:49.000 --> 00:08:54.000 in the chat. We're going to try to keep on top of that as much as we can. 00:08:54.000 --> 00:09:18.000 I want to start with our host, Lila. You know, as I was preparing for this, I was really thinking about the Internet Archive as a source that machine learning draws on, because, of course, you all are so critical to archiving and taking in the World Wide Web, and that's a key source 00:09:18.000 --> 00:09:24.000 of training material for so many of the new, cool machine learning techniques that we've seen. 00:09:24.000 --> 00:09:25.000 But it's clear also from the introduction we just heard from Brewster:
00:09:25.000 --> 00:09:39.000 you all are doing a whole lot with generative AI techniques to improve the commons and to sort of mesh the commons together. 00:09:39.000 --> 00:09:43.000 What's that been like for you along this ride, right? 00:09:43.000 --> 00:09:51.000 I mean, I expect you have to make that clarification to a lot of people, not just to me. 00:09:51.000 --> 00:09:58.000 You are still muted, and you just re-muted yourself. 00:09:58.000 --> 00:10:02.000 Rookie mistakes over here. Sorry about that. Yeah. 00:10:02.000 --> 00:10:09.000 So, what has it been like? I would say it has been exciting, you know. 00:10:09.000 --> 00:10:25.000 I think over the last — you know, I would say since 2016 — there's been a lot of angst going on in the tech community about social media, and kind of the ways in which 00:10:25.000 --> 00:10:45.000 our optimism about democratizing access through the just-publish, publish, publish model was not quite borne out. And I think what is really exciting, and also interesting, about the AI space is that the Internet feels playful 00:10:45.000 --> 00:10:55.000 again. Like, there's so much fun, creative, quirky, interesting, funny stuff happening in this space. 00:10:55.000 --> 00:11:03.000 I think that's happening both, you know, internally at the Internet Archive, but also just out there across the open web. 00:11:03.000 --> 00:11:25.000 And so I think what's been really exciting is seeing the ways in which these new tools are pumping out new stuff for us to kind of chew on and play with. And, you know, as an archive and an 00:11:25.000 --> 00:11:30.000 institution that thinks about keeping stuff around for the long term, 00:11:30.000 --> 00:11:34.000 I have a lot of curiosity, and I know we're thinking a lot about where the outputs of AI will live. 00:11:34.000 --> 00:11:37.000 And should they be in the commons? And who's going to archive them 00:11:37.000 --> 00:11:54.000 for the long term, right? Especially if, as the Copyright Office has so far said, a lot of these outputs are not copyrightable. 00:11:54.000 --> 00:12:01.000 That means they're basically born public domain, which means: wow, that's a huge influx of information and culture into the commons, which I think is really exciting and interesting. 00:12:01.000 --> 00:12:13.000 We're gonna have all the same problems, and more, as we've had with social media, especially with just floods and floods of information. 00:12:13.000 --> 00:12:14.000 And so what we're hoping to do at the Internet Archive is to help provide, 00:12:14.000 --> 00:12:36.000 you know, that library infrastructure: things like context and provenance, and the ability to verify sources and get back to some kind of, shall we say, human-legible truth, or at least verification processes. 00:12:36.000 --> 00:12:41.000 So as sort of a cultural institution that's always been in the digital space, 00:12:41.000 --> 00:12:44.000 I think this is our time to shine. 00:12:44.000 --> 00:12:52.000 So I'm really, really excited to be here at this moment, and super excited to have this conversation with these organizations today. 00:12:52.000 --> 00:12:56.000 Yeah, I mean, I think it's very exciting to be here. 00:12:56.000 --> 00:12:59.000 And we're gonna switch to Jacob in a second.
00:12:59.000 --> 00:13:04.000 I really wanna echo what you said about it being fun. There is a certain — 00:13:04.000 --> 00:13:05.000 there's certainly a lot of challenge and a lot of weighty ethical questions, 00:13:11.000 but there's also joy, not just in this conversation but in ones happening all over, and I really love that observation. 00:13:11.000 --> 00:13:13.000 So, Jacob, I want to switch to you. You know, 00:13:13.000 --> 00:13:27.000 Lila was just talking about how, hey, we provide all this stuff that other people can cite, and you are, of course, here to represent the world's most famous source of citations: 00:13:27.000 --> 00:13:35.000 the Wikimedia Foundation, and one of its several projects, 00:13:35.000 --> 00:13:43.000 Wikipedia. And you know, you all are in sort of an interesting space, right, where you are both — 00:13:43.000 --> 00:13:49.000 many of the largest models, especially text models, are trained 00:13:49.000 --> 00:13:50.000 repeatedly on Wikipedia; Wikipedia is overweighted relative to other websites 00:13:50.000 --> 00:14:07.000 in these training processes — but at the same time, you all are also stewards of a community that is creating new content every day. And so you're sort of sitting in the middle there. 00:14:07.000 --> 00:14:10.000 How's that ride been from your perspective? 00:14:10.000 --> 00:14:11.000 Yeah. Thanks, Luis, and very exciting to be here today. 00:14:11.000 --> 00:14:20.000 I think there's definitely some things coming at Wikimedia, and Wikipedia in particular, from a few different directions. 00:14:20.000 --> 00:14:26.000 So one thing to note is the Wikipedia editor 00:14:26.000 --> 00:14:29.000 community is very large, and is very quick to jump on a lot of these things in different ways. 00:14:29.000 --> 00:14:50.000 And so you can see people are already discussing community policies. There was an editor, a user named Pharos, who has already tried to use ChatGPT to generate an article called "Artwork title," which has subsequently been subject to quite a lot of human 00:14:50.000 --> 00:14:54.000 editing, as an experiment to see how ChatGPT would actually do in generating Wikipedia content. And I think the answer so far is: not actually that 00:14:54.000 --> 00:15:03.000 well. Where the community's coming down in terms of the use of these things 00:15:03.000 --> 00:15:19.000 for Wikipedia is that it takes quite a lot of human effort to take something that is generated from a chatbot and then actually make sure that it is correct, that the citations it has — if it has any — are correct, to get it citations where it needs them, and to turn it into something that 00:15:19.000 --> 00:15:26.000 is useful and accurate on Wikipedia. And I think it's interesting if you compare with what Lila was saying about the Internet Archive. 00:15:26.000 --> 00:15:44.000 Wikipedia is a very large collection of knowledge, but it's looking to organize that knowledge according to a certain kind of academic model, where you want the information there to be well written in a certain neutral tone, with reliable sources, and there should only be articles 00:15:44.000 --> 00:15:47.000 about topics that are notable and have those kinds of reliable sources. 00:15:47.000 --> 00:15:52.000 It's not literally everything under the sun, and we are not necessarily the first port of call to archive everything. 00:15:52.000 --> 00:15:59.000 That's more the Internet Archive side of things.
And instead, there's more of a refinement that is coming out of that. 00:15:59.000 --> 00:16:06.000 And I think that's probably why — going back to what you were saying — it is actually overweighted in a certain way when it comes to training the AIs. 00:16:06.000 --> 00:16:19.000 Because Wikipedia represents the collective output of a lot of humans working together to try to get knowledge into a format that has reliable, accurate citations about a variety of different topics. And so I think people who are training 00:16:19.000 --> 00:16:27.000 these systems are looking for exactly that kind of data when they're doing this kind of work. 00:16:27.000 --> 00:16:28.000 It is, yeah. 00:16:28.000 --> 00:16:33.000 It's also simply just got better copy editing than the average Internet page, 00:16:33.000 --> 00:16:34.000 I would suspect. 00:16:34.000 --> 00:16:35.000 That is true. Yeah. Well, I mean, I think it really goes to that human effort. 00:16:35.000 --> 00:16:38.000 There's a level of time and effort that is going into producing that content, 00:16:38.000 --> 00:16:46.000 and it's that human community that is really, really important. 00:16:46.000 --> 00:16:55.000 And that's one of the things that I think about in terms of how these are affecting things. Because, on the one hand, as we were saying in the intro, you can use ChatGPT 00:16:55.000 --> 00:17:04.000 to help you write something like a good introduction to a session, and if it does help people overcome writer's block and produce good-quality content — humans 00:17:04.000 --> 00:17:07.000 and AI working together — then that's great. 00:17:07.000 --> 00:17:13.000 That might make it easier for people to contribute to Wikipedia and make more high-quality content. On the flip side, 00:17:13.000 --> 00:17:27.000 if it's producing a lot of poor-quality content and wasting a lot of people's time, it might make it a lot more difficult to contribute to Wikipedia, and make it harder for good-quality, human-produced knowledge to be out in the world. And I think it's still 00:17:27.000 --> 00:17:32.000 something that we're seeing happen live, in terms of the balance of those two effects. 00:17:32.000 --> 00:17:41.000 I mean, I was gonna say, you know, this experimenting is already very much happening in the real world, right? 00:17:41.000 --> 00:17:48.000 You know, a particular article that's already out there, and I know there have been other experiments as well. 00:17:48.000 --> 00:17:51.000 Is it too early to draw conclusions, or are there any? 00:17:51.000 --> 00:18:05.000 And of course I might mention, I think, for our audience who might not be fully conversant with how Wikipedia works: Wikipedia is not one community. 00:18:05.000 --> 00:18:18.000 It is many different communities, so the conclusions that English Wikipedia might draw may be very different from Japanese Wikipedia, or Finnish Wikipedia, or Tagalog Wikipedia. 00:18:18.000 --> 00:18:26.000 Yeah, I mean, I'm at least seeing it being very useful in certain functions. I think, actually, looking at the multiple language communities, 00:18:26.000 --> 00:18:36.000 it seems like translation is one of the things that these AI machine learning programs are the best at. You can generate a very accurate translation of some existing content and then throw that at somebody and say: hey, take a look at this, just sort of double-check it.
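[Editor's note: a minimal sketch of the machine-drafts, human-reviews translation loop Jacob describes, assuming the Hugging Face transformers package. This is not Wikipedia's actual Content Translation tool; the model choice and sample text are illustrative assumptions.]

```python
# Sketch: the machine generates draft translations; a human editor
# remains responsible for checking each one before it is published.
from transformers import pipeline  # pip install transformers

translate = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")

article = [
    "The lighthouse was built in 1854.",
    "It was restored by volunteers in 2009.",
]

for source in article:
    draft = translate(source, max_length=512)[0]["translation_text"]
    print("SOURCE:", source)
    print("DRAFT :", draft)
    # Human-in-the-loop gate: nothing is published without sign-off.
    if input("Publish this paragraph? [y/N] ").strip().lower() == "y":
        print("...published\n")
    else:
        print("...held for manual rewrite\n")
```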
00:18:36.000 --> 00:18:46.000 The amount of human work to get a good-quality translation using these programs is relatively low. 00:18:46.000 --> 00:18:52.000 And so I think getting a lot of content into multiple languages is actually a really good use of these types of programs. 00:18:52.000 --> 00:18:55.000 More broadly, I'm personally a little bit optimistic. 00:18:55.000 --> 00:19:06.000 I'm leaning towards the view that Wikipedia will continue to be a source of good-quality content, even as maybe more lower-quality 00:19:06.000 --> 00:19:07.000 content is produced by AI in the rest of the Internet. 00:19:07.000 --> 00:19:23.000 The model of doing research and finding good sources and putting that all together to create encyclopedia articles is something that I think will only be increasingly valuable, both for machines and humans, 00:19:23.000 --> 00:19:26.000 as we move forward in this area. 00:19:26.000 --> 00:19:31.000 Yeah, I suspect that we will come back several times to this question of: okay, well, what's left for humans? 00:19:31.000 --> 00:19:37.000 What are we doing better or differently 00:19:37.000 --> 00:19:42.000 as a result? And actually, that's a great transition to our last panelist, Kat 00:19:42.000 --> 00:20:01.000 Walsh. Kat, you are the general counsel of Creative Commons, which is an organization that, I think perhaps unfairly, has come to be stereotyped a little bit by your most successful project, which is of course your copyright licenses. But the organization, 00:20:01.000 --> 00:20:04.000 I think, has always aimed at being more than just a copyright organization, right? 00:20:04.000 --> 00:20:13.000 And certainly in this current moment, you're thinking about a lot of stuff beyond the scope of the four walls of copyright. 00:20:13.000 --> 00:20:19.000 Do you want to tell us a little bit about that? 00:20:19.000 --> 00:20:30.000 Sure. So as generative AI has become more popular, people have come to CC and asked us: what can you do to fix this? 00:20:30.000 --> 00:20:40.000 And it turns out that what people mean by "this" can be a whole variety of things. And some of them are like: oh, how do I make sure that I am crediting work appropriately? 00:20:40.000 --> 00:20:43.000 Or, how do I make sure that I get credit for my work? 00:20:43.000 --> 00:20:54.000 And sometimes those problems are much broader than copyright. They're like: I'm afraid of what this means for the future of art, for the future of people like me, for the future of creation. 00:20:54.000 --> 00:20:55.000 And some of these problems we can produce a guide for, 00:20:55.000 --> 00:21:06.000 and some of these problems are much broader than that. And we've been having conversations with all of our community to try and figure out what this is. 00:21:06.000 --> 00:21:26.000 So, we are known for our best product, the CC licenses, and those were created about 20 years ago to address a problem that existed at the time, which was that the default rules of copyright did not match the way that people wanted to share — either creators or reusers. So we 00:21:26.000 --> 00:21:37.000 came up with the suite of licenses that more closely matched that, and let people do what they already wanted to do and needed to do legally, and gave them a good framework to make that easier. 00:21:37.000 --> 00:21:40.000 But CC is not, at its heart, a copyright organization.
00:21:40.000 --> 00:21:48.000 The licenses are just the tool that we use for our real mission, which is to reduce barriers to the sharing and creation of knowledge. 00:21:48.000 --> 00:21:55.000 We want to empower everyone to be readers, to be creators, to be able to participate fully in culture. 00:21:55.000 --> 00:21:56.000 So now we're coming up to a new shift in the assumptions that all of our tools rest on. 00:21:56.000 --> 00:22:06.000 What does it mean to participate in culture in this new 00:22:06.000 --> 00:22:19.000 world of generative AI? And that's a much harder question that we're trying to look at, because we are an organization that doesn't believe that copyright exclusivity is the best way to promote culture. 00:22:19.000 --> 00:22:33.000 But a lot of the things that people have used as incentives to create, or as guidelines for what is ethical — like, 00:22:33.000 --> 00:22:36.000 what should people do? — have rested on copyright and licensing. Like: 00:22:36.000 --> 00:22:52.000 oh, I can share it under these terms, as long as I do these things. And the way that these systems are working is breaking some of those assumptions, and now we're having to think more about the concerns around what we should recommend people do — and we don't want to 00:22:52.000 --> 00:23:06.000 expand copyright for that. So what should we recommend as an organization, and what are the right tools for that? And that's such a more interesting and more difficult question. 00:23:06.000 --> 00:23:22.000 Right. I mean, I did say that we were not gonna get too much into copyright today, but I think you're right that, for the sharing community on the Internet, copyright has been our hammer, 00:23:22.000 --> 00:23:30.000 and so everything has been a nail. And right now we are finding, very actively, the limits of that approach, right? 00:23:30.000 --> 00:23:37.000 You know, I wanna — I think a thing that's been coming up in a few different ways, right? 00:23:37.000 --> 00:23:44.000 Lila, you were talking about — well, Brewster was talking about the ability to input things from many languages. 00:23:44.000 --> 00:23:45.000 Jacob, you were talking about outputting in many languages. 00:23:45.000 --> 00:23:52.000 Kat, you know, you didn't mention this directly, but certainly one of the things that CC 00:23:52.000 --> 00:23:58.000 has been thinking about for a very long time is how we do this in an international way. 00:23:58.000 --> 00:24:05.000 I mean, for those of us who have dreamed of a boundaryless world, 00:24:05.000 --> 00:24:06.000 that's very exciting. Are there challenges that you all are seeing in practice? 00:24:06.000 --> 00:24:16.000 Are there concerns or edge cases that we need to be worrying about as we try to bring 00:24:16.000 --> 00:24:24.000 our commons to a broader world? 00:24:24.000 --> 00:24:27.000 Jacob, I see you nodding. You wanna jump in first? 00:24:27.000 --> 00:24:30.000 Sure. I think it's a tough question to answer, because it gets into issues of context and quality, 00:24:30.000 --> 00:24:50.000 and how different things are perceived.
And there are some things that you can kind of play around with more innocently than others. An example of this, actually, from a number of years ago: for a long time the Swedish-language Wikipedia was the second largest one, behind 00:24:50.000 --> 00:24:55.000 English, and the reason for that — even though there aren't that many people writing in Swedish — was that they had 00:24:55.000 --> 00:25:15.000 a bot that was generating stub articles about geographic locations — like any place in and around Sweden, or all sorts of different geographies that it could find a few citations about — and generating a fairly standardized stub article that people could expand later. And it generated, 00:25:15.000 --> 00:25:25.000 I think, well over a million of these, not all of equal quality, either; some of them have ultimately been deleted, while others were expanded. And so you can see, that was years ago, 00:25:25.000 --> 00:25:33.000 well before ChatGPT existed in any form. But it's the kind of thing where that's not that hard to generate, and it's a relatively safe experiment: doing geographic locations doesn't run into a lot of the problems 00:25:33.000 --> 00:25:56.000 you potentially have. But then, if you're starting to translate things that are more sensitive — biographies about people, for example — you definitely run the risk of starting to generate statements that may be read inaccurately, or that may, in a different language, create a different impression about someone. I think that 00:25:56.000 --> 00:25:57.000 can be especially sensitive if you start getting into things like people who committed crimes, or maybe were accused of crimes 00:25:57.000 --> 00:26:07.000 but it was never proven that they committed them. And getting that nuance across can be really difficult as you do 00:26:07.000 --> 00:26:15.000 translations. Our philosophy, at least for now, has continued to be that these tools are best when they help people be more efficient. 00:26:15.000 --> 00:26:18.000 And so, at least on Wikipedia right now, 00:26:18.000 --> 00:26:25.000 there's no direct translation happening that gets published just by a machine. There's a translate tool: 00:26:25.000 --> 00:26:35.000 as a human editor, you can use the machine to generate a bunch of pre-translated text that it then asks you to check over. 00:26:35.000 --> 00:26:42.000 But you're taking responsibility as the human translator for reading that over before you say: yes, put this out there, publish it. 00:26:42.000 --> 00:26:50.000 And so we're asking at least for that much — that human beings are checking over what is being generated and making sure that it does make sense. 00:26:50.000 --> 00:26:56.000 I think that part is really important; even if it's way easier than doing the full translation yourself, 00:26:56.000 --> 00:27:01.000 having that human set of eyes is really important. 00:27:01.000 --> 00:27:02.000 Yeah, I think Jacob makes some really great points 00:27:02.000 --> 00:27:15.000 there. So one of the other projects that Brewster didn't mention at the top of the session is our Democracy's Library project, right?
00:27:15.000 --> 00:27:38.000 Democracy's Library is kind of a moonshot idea that we have at the Internet Archive, of bringing together all of the published works of all of the democratic governments in the world into one sort of useful, reliable library of democratically produced information. There are obviously some interesting copyright 00:27:38.000 --> 00:27:44.000 questions there. But the thing that is, I think, 00:27:44.000 --> 00:27:48.000 interesting from this sort of global, national-borders 00:27:48.000 --> 00:28:03.000 idea is, you know, democracy tends towards the open and tends towards the transparent, whereas autocracies and, you know, illiberal, 00:28:03.000 --> 00:28:09.000 undemocratic institutions tend towards the locked-down and enclosed. 00:28:09.000 --> 00:28:14.000 And I know, as we are generating all of this information, 00:28:14.000 --> 00:28:38.000 one of the scary questions out there is: well, how much of the information of our democratic governments should be out there and available to the Russias and the Chinas and the more autocratic, illiberal countries? There are some just very serious national security 00:28:38.000 --> 00:28:41.000 questions there that I know the folks who are working on Democracy's Library are thinking very hard about. 00:28:41.000 --> 00:28:49.000 And so this is well out of my depth, as mostly just a copyright nerd. 00:28:49.000 --> 00:28:55.000 But I think there are a lot of interesting tensions with our communities being this default-open, default-transparent space, 00:28:55.000 --> 00:29:17.000 and then what that means not just for the sort of individual potential harms and privacy issues that Jacob was talking about, but also: what does this mean for institutions, for governments? 00:29:17.000 --> 00:29:23.000 And so again, I don't have any answers there, but I think that that is one of the bigger challenges 00:29:23.000 --> 00:29:30.000 that is being brought to the fore by these technologies. 00:29:30.000 --> 00:29:31.000 Cool. 00:29:31.000 --> 00:29:39.000 Well, I mean, I think that's a great — we had somebody in chat ask: what happens when you have models that can run on our phones? Right now, 00:29:39.000 --> 00:30:02.000 a particular concern of mine, as somebody who's been involved in open source software for a very long time, is: what does this do to power in the software industry, right? 00:30:03.000 --> 00:30:09.000 Because there's one vision where — as many of our audience may be aware, training a lot of these models has been very expensive, in part because you might wanna have an entire copy of the Internet Archive just hanging around. Which, you know, here's a good plug: go donate to the Internet 00:30:09.000 --> 00:30:26.000 Archive, because hosting all of it isn't cheap. And if you, as a private entity, want to have a copy of the Internet Archive around to train your machine learning on, well, guess what, that is also not cheap. So there has been at least some concern, I think, that machine learning 00:30:26.000 --> 00:30:31.000 is gonna centralize power in the kinds of entities that can afford that kind of thing, right? 00:30:31.000 --> 00:30:46.000 Do we think that investing in open commons — like the Archive, Wikipedia, you know, CC-licensed commons like Flickr — 00:30:46.000 --> 00:30:57.000 do we think that's enough?
Or do we need to be building other open machine learning infrastructures, you know, on top of that, in order to distribute power? 00:30:57.000 --> 00:31:03.000 I mean, I'll just — again, this is a little over my head in terms of what my expertise is. 00:31:03.000 --> 00:31:09.000 But you know, the Internet came from the government, right? That's kind of where it was born. 00:31:09.000 --> 00:31:24.000 It was a public project, and I think the fact that a lot of AI is happening in a corporate environment, where the profit motive is the core driver of what's happening — 00:31:24.000 --> 00:31:36.000 I think that is a real source of the problems and anxiety that, as a society, we have. 00:31:36.000 --> 00:31:38.000 And that's why I think: yes, please donate to all of our organizations. 00:31:38.000 --> 00:31:44.000 We need the help now, truly, now more than ever. Because, yeah, there's a flood of information out there, 00:31:44.000 --> 00:32:00.000 and our organizations are here to try to share knowledge, and not just stuff, right — which might be bad stuff. 00:32:00.000 --> 00:32:07.000 But I actually think, you know, government needs to catch up. 00:32:07.000 --> 00:32:08.000 Europe is moving along, but I don't know what our government is going to be able to do, 00:32:08.000 --> 00:32:22.000 given how much gridlock we have. But a lack of real government involvement, 00:32:22.000 --> 00:32:28.000 I think, is leading us to a very scary place, very fast. 00:32:28.000 --> 00:32:38.000 Yeah. Well, I mean, this, I think, gets to one of the themes of what people are talking about in chat, which is very much this distinction between what's legal, 00:32:38.000 --> 00:32:40.000 what's technically possible, and what's ethical, right — 00:32:40.000 --> 00:32:54.000 trying to do the right thing. And so, Kat, I wanna come back to you a little bit, because one of the ways in which US law — and I think we might wanna bring in EU law as well — 00:32:54.000 --> 00:32:58.000 but one of the ways in which US law addresses that gap between what is legally allowed and what is ethically a good idea is fair use, right? 00:32:58.000 --> 00:33:16.000 It's the idea that you have some extra permissions if you're an educator, if you are not seeking to copy for purposes of profit. 00:33:16.000 --> 00:33:22.000 You know, your licenses strongly endorse fair use. 00:33:22.000 --> 00:33:27.000 How have you been seeing that play out in this space? 00:33:27.000 --> 00:33:40.000 Yeah, I think it's broken a lot of people's intuitions, because people see an output that looks a lot like some inputs — resembles some inputs — 00:33:40.000 --> 00:33:42.000 and they think: hey, isn't that a copy? Shouldn't there 00:33:42.000 --> 00:33:46.000 be credit? Doesn't that need to be licensed? 00:33:46.000 --> 00:33:58.000 But because of the way it works, there's not a direct connection — like, you put something in, you get something out that took some bits and pieces from it — so there's not a way 00:33:58.000 --> 00:34:08.000 to say: oh yes, this use needed to be licensed. And I think it varies a lot, too, 00:34:08.000 --> 00:34:13.000 depending on the ways that people are using the tools. Like, if they want to get something out that looks like a particular thing,
00:34:13.000 --> 00:34:19.000 maybe that feels a little bit more like getting a copy than somebody who's just like: generate me 00:34:19.000 --> 00:34:24.000 fantasy art that, you know, has a princess and has a castle — and they get something out 00:34:24.000 --> 00:34:29.000 that kind of resembles existing art, but they were never looking for it specifically. 00:34:29.000 --> 00:34:35.000 So much of copyright law, particularly in the U.S., 00:34:35.000 --> 00:34:39.000 which is very case-law based, has kind of been based on vibes. 00:34:39.000 --> 00:34:44.000 Does this feel right? Does this not feel right? And you'll see the decisions make, 00:34:44.000 --> 00:34:50.000 you know, a lot of tortured analogies to try to justify which side they came out on. But a lot of it, I think, really comes down to: 00:34:50.000 --> 00:34:56.000 does this feel like it is supporting previous decisions, or does it not? 00:34:56.000 --> 00:34:59.000 Does this feel like something that should have been licensed, or doesn't it? 00:34:59.000 --> 00:35:03.000 Which makes it hard, as a lawyer, to give a very black-and-white answer. 00:35:03.000 --> 00:35:20.000 I tend to be biased on the side of thinking of these things as a transformative use. You know, in copyright there's the idea that you can take the basics of an idea, but not the particular description of that idea. 00:35:20.000 --> 00:35:30.000 And so many of these systems seem to be — you know, they're not copying a particular expression. 00:35:30.000 --> 00:35:41.000 They're copying a vibe, they're copying a feel. But should that be so disconnected that nobody ever gets credit for their work? Does that incentivize people to create, 00:35:41.000 --> 00:35:48.000 or does it incentivize people to keep their works private? 00:35:48.000 --> 00:35:54.000 I don't know. And it really has depended on where people are coming from. Like, for me: 00:35:54.000 --> 00:36:03.000 as a hobbyist creator, I'm happy to see my works being used and remixed and transformed 00:36:03.000 --> 00:36:07.000 in this way, even if I don't get credit, so that what I've done might end up out in the world. 00:36:07.000 --> 00:36:10.000 But I also don't make a living from my art, and I don't need 00:36:10.000 --> 00:36:13.000 the credit for it — and other people 00:36:13.000 --> 00:36:14.000 do. And maybe those people are creating most of the art. 00:36:14.000 --> 00:36:21.000 So yeah, it's depended so much on where people are coming from. 00:36:21.000 --> 00:36:37.000 Yeah, I mean, I have looked at the question of what it would take for Copilot to attribute all the code — Copilot, for those not familiar, is a generative AI that outputs computer code — and one estimate is it would take something like a 00:36:37.000 --> 00:36:47.000 200-million-page PDF to do attribution. Which turns out to be not super useful, maybe, to have a 200-million-page PDF listing 00:36:47.000 --> 00:36:51.000 everybody who's ever done open source. You know, I wanna get to this question. 00:36:51.000 --> 00:36:53.000 Yeah.
00:36:53.000 --> 00:37:11.000 I think one of the things that we use fair use for in this country is to try to broaden the scope of what's in the public sphere for discussion. And I think — again, a topic that's been coming up in chat, and I think is near and dear to all of us here — 00:37:11.000 --> 00:37:17.000 is this question of representation in the public sphere, 00:37:17.000 --> 00:37:28.000 right, in the commons. You know, Wikipedia has certainly undertaken a lot of efforts to broaden what kinds of knowledge are there. There's a great effort called Whose Knowledge? that literally asks the question: 00:37:28.000 --> 00:37:29.000 if we say that Wikipedia is the sum of all human knowledge, 00:37:29.000 --> 00:37:33.000 okay, well, whose knowledge is represented in that "all human knowledge"? 00:37:33.000 --> 00:37:42.000 And I think we've all seen examples where, if anything, it appears that machine learning may not just copy the biases that are in our data sets 00:37:42.000 --> 00:37:58.000 but can often reinforce those biases, right? It can be even more sexist, even more racist, than the underlying data set that it's drawn from. On the flip side, 00:37:58.000 --> 00:37:59.000 you have a lot of concerns about indigenous data autonomy, right — 00:37:59.000 --> 00:38:06.000 the idea that we actually don't want all of certain spheres of knowledge public or trainable. 00:38:06.000 --> 00:38:20.000 I was wondering — particularly, Jacob, I know that your organization has thought a lot about this, though I see Lila nodding as well — 00:38:20.000 --> 00:38:25.000 you know, if either of you want to jump in with some thoughts on that, I'm sure it'd be welcome. 00:38:25.000 --> 00:38:36.000 Yeah, I'll start on this. I think it's definitely a tricky problem, because you get a contrast between people who are trying to preserve knowledge — especially things like 00:38:36.000 --> 00:38:43.000 indigenous works that may not exist anymore, or languages that have a relatively small number of speakers — 00:38:43.000 --> 00:38:55.000 but then also, I think it is correct to say that a lot of these technologies are in some ways extractive of these groups, and it can be really challenging to combine those two things. 00:38:55.000 --> 00:38:56.000 There was a really interesting piece that was written — I am blanking on the authors, 00:38:56.000 --> 00:39:09.000 I feel so bad about this — but it was by a group of Maori language preservation folks that looked at how Whisper AI was created, 00:39:09.000 --> 00:39:18.000 how well it did in translating Maori, the ways in which it was maybe having difficulties, and also how they got the data to be able to do that language 00:39:18.000 --> 00:39:38.000 in the first place — which included a lot of contacting professors who spoke Maori and just kind of asking them to provide relatively low-paid labor, like listening to and commenting on spoken text, or speaking phrases out loud, in order to help train the AI. And so there's a 00:39:38.000 --> 00:39:43.000 lot that goes into making these things that can get really dicey very quickly. 00:39:43.000 --> 00:39:49.000 I think when we're at our best — not just in AI, but in the overall open source and free culture movement —
00:39:49.000 --> 00:39:52.000 when we're at our best, we are inviting people from these different cultures to contribute, and to contribute in ways that they feel accurately reflect their culture. 00:39:52.000 --> 00:40:11.000 It helps to preserve their knowledge for the world, and I think that's what we wanna be aiming towards: creating an environment that invites a diverse group of contributors, so that it's not a small group of people from a few places 00:40:11.000 --> 00:40:20.000 writing about everybody else, but instead inviting people from many different cultures and many different backgrounds to contribute their knowledge. 00:40:20.000 --> 00:40:35.000 Yeah, that's so well said. And I guess I will say the way that the Internet Archive seeks to support that is to ensure that, you know, the primary sources, 00:40:35.000 --> 00:40:36.000 if you will, are available to the folks who want to contribute, right? 00:40:36.000 --> 00:41:04.000 So one of our projects from a few years back: we worked with Wikipedia and with a Japanese American organization called Densho to look at the Wikipedia article about the Japanese internment experience, and identified places where that experience needed to be 00:41:04.000 --> 00:41:11.000 bolstered, and worked with that community. This was all grant funded — by, I believe it was the U.S. 00:41:11.000 --> 00:41:12.000 Park Service, which was amazing. 00:41:12.000 --> 00:41:14.000 So we were able to go ahead and purchase somewhere 00:41:14.000 --> 00:41:34.000 around 500 extra books that were considered to be authoritative on this topic, and then work with specific scholars in that community to add to Wikipedia, right? 00:41:34.000 --> 00:41:40.000 And then ensuring that those resources that were cited are available, 00:41:40.000 --> 00:41:44.000 you know, one click away from the Internet Archive, right? So that's not a generative 00:41:44.000 --> 00:42:09.000 AI thing. But the way I think about this is that invitation to the communities who want to participate in the commons and want to build up that knowledge base, and ensuring that those things are done with, and respectfully of, those communities. I think with the AI question, 00:42:09.000 --> 00:42:15.000 there's just a hard question of: well, either we make sure that all the stuff is in there, and that's going to be the least biased 00:42:15.000 --> 00:42:23.000 it's going to be, because it will at least have these perspectives. 00:42:23.000 --> 00:42:27.000 On the other hand, 00:42:27.000 --> 00:42:37.000 a lot of folks don't want to participate, and they want their stuff out — they don't want it to be in there at all. And that is a choice of agency and human dignity. 00:42:37.000 --> 00:42:39.000 And so I think there's just a lot of hard questions. 00:42:39.000 --> 00:42:54.000 And this is where, with the civil society community, I'm really excited that we are engaging in these questions, because I am unconvinced that the sort of OpenAIs of the world are spending a lot of time thinking about these problems, right? 00:42:54.000 --> 00:42:55.000 They're just ingesting whatever they can and putting out whatever they can. 00:42:55.000 --> 00:43:06.000 And so, you know, finding more ways to fund — again, I'll just throw out that we need your help and support, right?
00:43:06.000 --> 00:43:22.000 If you think that having a public interest voice in all of that is important, you know, ensuring that the organizations that are doing that thinking are well supported and well funded is really important as well. 00:43:22.000 --> 00:43:23.000 You know, I'm — oh, go ahead, 00:43:23.000 --> 00:43:27.000 jump in on this one! 00:43:27.000 --> 00:43:31.000 Yeah, I would jump in on this one also. From my experience on Wikipedia, this is not a new problem with generative AI. 00:43:31.000 --> 00:43:41.000 It's appeared in other things, such as Wikipedia, and for some of the same reasons: Wikipedia is not a primary source. 00:43:41.000 --> 00:43:43.000 It's a secondary source, reflecting what is already available in the world. 00:43:43.000 --> 00:43:49.000 Similar to that, the AI is not generating new knowledge from observing the world; 00:43:49.000 --> 00:43:59.000 it's generating information based on observing what we've already published and put out there. 00:43:59.000 --> 00:44:07.000 And it's telling us a lot about ourselves. When we look at the biases that come out of these systems, it's taking them and amplifying them. 00:44:07.000 --> 00:44:14.000 And I would love to see more transparency into what it's being trained on, and more input into what it is getting trained on, 00:44:14.000 --> 00:44:27.000 if we want it to better reflect the world, and to be able to better reflect the sources that it's not currently able to draw from. And I think we need to be critical of the machine as not just 00:44:27.000 --> 00:44:42.000 generating knowledge — like, "this is objectively what is true about the world" — but as reflecting what we have already known. And maybe, if we don't like this picture, maybe we need to be better about the sources that it is coming from. Going back 00:44:42.000 --> 00:44:52.000 to a previous question: one of the issues with our kind of boundaryless world, also, is that information is coming out of these systems divorced from the contexts in which you'd normally see it. 00:44:52.000 --> 00:45:03.000 And I'm particularly also thinking about things like indigenous and traditional knowledge, where you normally see it in a context where it's clear how they would like you to interact with it — what the community norms are around it. 00:45:03.000 --> 00:45:19.000 And when it's coming out of one of these generative systems, it's missing that context, whether you're thinking about traditional knowledge, or things that have privacy implications, or something with the norms of an academic field. And I know, in the traditional knowledge 00:45:19.000 --> 00:45:23.000 sphere, there are plenty of people trying to make traditional knowledge work better with our new, like, digital universe. 00:45:23.000 --> 00:45:33.000 There's an organization called Local Contexts, for example, which is trying to put machine-readable labels on sources of traditional knowledge, to try to keep context for things where that would otherwise be missing. 00:45:33.000 --> 00:45:49.000 But, you know, it's so difficult when you're just getting something out of the context in which you'd normally see it. 00:45:49.000 --> 00:45:54.000 Well, actually, you know, I mean, we're running a little short on time
00:45:54.000 --> 00:46:16.000 here. A key theme that I think keeps coming up, in the incredibly voluminous chat and in the discussions we've been having, is: this feels like a fire hose, both for us personally — like, trying to keep up with literally what happened yesterday in ML — and 00:46:16.000 --> 00:46:20.000 for our organizations. Right? Our organizations are, you know, underfunded, understaffed nonprofits. 00:46:20.000 --> 00:46:30.000 And well, my day job is an underfunded, understaffed for-profit, but the same principle applies, right? 00:46:30.000 --> 00:46:38.000 We're all trying to keep up. How are you all dealing with that fire hose? I mean, are you enjoying it? 00:46:38.000 --> 00:46:42.000 Are you, you know, struggling with it? And for those who are just dipping their toes in, 00:46:42.000 --> 00:46:52.000 what would you advise? 00:46:52.000 --> 00:46:53.000 Oh! 00:46:53.000 --> 00:46:54.000 Sign up for Luis's newsletter — I will! 00:46:54.000 --> 00:46:55.000 Yes, really good. 00:46:55.000 --> 00:47:01.000 It is, it really is. I will. 00:47:01.000 --> 00:47:08.000 So, from my perspective, I feel like the copyright world just woke up to AI a couple of months ago. 00:47:08.000 --> 00:47:16.000 But there is a decade-long, maybe even older, group of experts that have been thinking about 00:47:16.000 --> 00:47:23.000 AI ethics, safety, bias — all of these things — for a long time, and there are some really smart experts out there. 00:47:23.000 --> 00:47:27.000 And so I think: find the folks you trust and follow them closely, is my best advice. 00:47:27.000 --> 00:47:40.000 I don't feel like I have my whole arms and head and brain around everything that's going on. 00:47:40.000 --> 00:47:45.000 But I do find it exciting, I have to say. I do. So 00:47:45.000 --> 00:47:48.000 I'm trying to just maintain the energy. And I will also say — 00:47:48.000 --> 00:47:59.000 a thought that I had over the last day or two is: there are a lot of people ringing very scary bells about, 00:47:59.000 --> 00:48:04.000 you know, what AI might do to our civilization and to our society. 00:48:04.000 --> 00:48:16.000 And I think you have to be a real optimist about technology to think that those kinds of civilization-ending things can happen. 00:48:16.000 --> 00:48:23.000 And I just find that a really interesting tension, right? Like, you have to be very optimistic about how good this technology is going to get to think it's going to bring us down as human beings. 00:48:23.000 --> 00:48:26.000 And so I just think that's really fascinating, this sort of simultaneous optimism- 00:48:26.000 --> 00:48:37.000 pessimism thing, which, just as sort of a look at human nature, is pretty fascinating as well. 00:48:37.000 --> 00:48:44.000 I don't think I'm as optimistic about how far this tech is gonna go. 00:48:44.000 --> 00:49:01.000 My feeling is that we will find ways to make it useful for us, and the stuff that does not work and isn't useful — like, remember when everyone was freaking out about NFTs? We're not talking about NFTs anymore, really, because they just didn't turn out to be that interesting 00:49:01.000 --> 00:49:07.000 or useful for people. So I just think, you know, there's a lot here, and I'm excited. 00:49:07.000 --> 00:49:15.000 And yeah, I'll leave it there to hear what other folks have to say.
00:49:15.000 --> 00:49:27.000 Yeah, I think for me, the only way to keep any semblance of following it is that it is enjoyable to follow. Like, I get a lot of joy from seeing: what have people used AI for today? 00:49:27.000 --> 00:49:28.000 And it's impossible to be an expert on every aspect, 00:49:28.000 --> 00:49:34.000 so you have to pick some that you're going to follow. 00:49:34.000 --> 00:49:41.000 Obviously, from my position, the copyright issues are very interesting, and the 00:49:41.000 --> 00:49:45.000 issues where copyright people want to weigh in. And, you know, I like that 00:49:45.000 --> 00:49:55.000 there are so many — you know, you can't talk about this without addressing the harms of it, but if it were just something that created a lot of harm, we wouldn't be talking about it. 00:49:55.000 --> 00:49:58.000 We would just say: okay, let's all agree not to use it. 00:49:58.000 --> 00:50:11.000 We're talking about AI because, despite all of the problems that need to be fixed, it's something that everybody — you know, at least a large portion of people — are using, find useful, find hope for in the future. 00:50:11.000 --> 00:50:12.000 And so trying to figure out: how do we get those good things 00:50:12.000 --> 00:50:18.000 and minimize the bad things? It's great to see something that everybody wants to use, wants to make with. 00:50:18.000 --> 00:50:33.000 And, yeah, seeing us as a society getting excited about something is a good thing to follow in the news. 00:50:33.000 --> 00:50:44.000 Yeah, I wanna actually go back, on this fire hose question, to something Lila said at the very start, which is that this has made the Internet feel fun again in a way that I think is very enjoyable for a lot of us. Following some of the AI stuff really is like catnip for lawyers, 00:50:44.000 --> 00:50:57.000 especially because it's presenting a lot of really novel questions that the law hasn't fully addressed, and that are very interesting to think about and speculate about. 00:50:57.000 --> 00:51:04.000 But I think the way that the fire hose of content comes through is in some ways actually not a new problem. There's a lot more public interest in it 00:51:04.000 --> 00:51:22.000 right now, and I think people are worried that there's going to be a lot of content generated through chatbots and other things, and we will need to think about how to address that, and how to make sure we have safeguards for identifying that content and making sure it's not used in 00:51:22.000 --> 00:51:36.000 inappropriate places. But, like I mentioned with the Sweden thing, there have been ways to program bots to generate mass low-quality content for years and years, and there have been lots of people who have tried to do that. 00:51:36.000 --> 00:51:41.000 And it actually doesn't take very many people running bots generating low-quality content to get plenty of it. 00:51:41.000 --> 00:51:44.000 So in some ways it's a problem that people already know about, 00:51:44.000 --> 00:51:54.000 and you can look at it as a problem of, you know, spam, and of identifying good sources, and of the kinds of processes and methods that organizations use.
00:51:54.000 --> 00:51:59.000 You know, if a news site is using a good journalistic method, you can more or less trust what content is on there, because they are having people check it over. 00:51:59.000 --> 00:52:20.000 If Wikipedia is using a good method of having users check reliable sources, you can more or less trust the content that people are putting on there, because they are doing that kind of checking. Whereas there probably are a lot of sites out there where there really is 00:52:20.000 --> 00:52:24.000 just mass posting of generated content already, and it's only going to get worse. 00:52:24.000 --> 00:52:32.000 And those will, I think, probably lose a lot of trust, because you just don't know what you're looking at, and you don't know if it's accurate. 00:52:32.000 --> 00:52:45.000 And that does get at a question in the chat a while back that I thought was very interesting, about the way that all of this affects education for young people right now. That's a much bigger topic 00:52:45.000 --> 00:52:46.000 that we certainly don't have the time for; it could be its own webinar, or many webinars. 00:52:46.000 --> 00:52:55.000 But I just want to flag it as something to think about, because this question of media literacy, of being able to identify whether you're looking at something that is good quality or not, is going to be really important, 00:52:55.000 --> 00:53:03.000 I think. 00:53:03.000 --> 00:53:13.000 Yeah, boy, that last one really hits home. I've definitely had conversations with very smart, intelligent people who say, oh, I'd forgotten about X, 00:53:13.000 --> 00:53:20.000 but ChatGPT reminded me. And I'm like, the reason you'd forgotten about it is because it doesn't exist. ChatGPT 00:53:20.000 --> 00:53:23.000 hallucinated it. And you know, the literacy component 00:53:23.000 --> 00:53:31.000 there! It's scary, right? 00:53:31.000 --> 00:53:41.000 There's such an interesting tension here, something that I think about a lot. I mean, I look at the backgrounds: Jacob, your background is book-free, 00:53:41.000 --> 00:53:51.000 but Kat's and Lela's aren't, and I know there are books elsewhere in your house. For those who don't know, Jacob just moved, and all his books are still in boxes. 00:53:51.000 --> 00:54:08.000 I'm sure there will be books behind him soon. Everybody here is a lover of the printed word. But the printing press also caused a hundred years of violent warfare across Europe, and I feel like we're really gonna struggle with that dichotomy: none of 00:54:08.000 --> 00:54:17.000 us would want to go back to a time before the printing press, but also I'm not sure any of us would necessarily have wanted to live through 00:54:17.000 --> 00:54:33.000 those hundred years of warfare. And across many conversations I'm having, we're all struggling a little bit with how to balance that joyful exuberance, because, you know, as Jacob called it, it 00:54:33.000 --> 00:54:37.000 is 1,000% catnip for lawyers, right? And how do we 00:54:37.000 --> 00:54:43.000 juggle that exuberance with this realization of what it is gonna be? 00:54:43.000 --> 00:54:47.000 I mean the strike, the writers' strike
00:54:47.000 --> 00:54:59.000 that happened last night is, in no small part, because people who create text for a living are genuinely anxious about their futures. 00:54:59.000 --> 00:55:04.000 Right? And I think all of our organizations are thinking hard about that. 00:55:04.000 --> 00:55:08.000 Certainly governments are thinking hard about that, but 00:55:08.000 --> 00:55:13.000 none of us have good or easy answers. And even though everybody here is excited, 00:55:13.000 --> 00:55:24.000 I hope no one in the audience comes away with the idea that we're glib about these challenges, because certainly it's something we've all been thinking about. 00:55:24.000 --> 00:55:32.000 We only have a few minutes left, I know that. Oh, one thing, actually, I wanted to say: all of these organizations are talking. 00:55:32.000 --> 00:55:33.000 This is not the first time that these organizations have talked about these topics, 00:55:33.000 --> 00:55:57.000 which perhaps should not surprise you. So keep an eye out: either today or tomorrow, the Movement for a Better Internet will be publishing a statement on these kinds of issues, which includes all of the organizations here, as well as a large variety of other 00:55:57.000 --> 00:56:04.000 public-interest-oriented groups. So this will hopefully be the start of public conversations, 00:56:04.000 --> 00:56:14.000 certainly not the end of it. I'm pretty excited to see what comes out of those kinds of discussions. Boy, just synthesizing all the meeting chats is 00:56:14.000 --> 00:56:23.000 gonna be great. I think that IA was gonna step in with a few closing remarks 00:56:23.000 --> 00:56:27.000 now that we're almost at the top of the hour, is that right? 00:56:27.000 --> 00:56:28.000 That's right, Louis. Yeah, thanks for that. 00:56:28.000 --> 00:56:34.000 So look, first thing is, I want to thank everyone who's here today. 00:56:34.000 --> 00:56:39.000 You have all been great in the chat; this is the liveliest and safest chat that we've had. 00:56:39.000 --> 00:56:42.000 Honestly, there's not a single thing that we needed to cringe or worry about. 00:56:42.000 --> 00:56:50.000 There are people, as I mentioned at the top, working behind the scenes to make sure that this space is safe for 00:56:50.000 --> 00:56:51.000 all attendees, and you've been awesome today. So thank you, 00:56:51.000 --> 00:56:58.000 audience, for being good actors and showing up today. 00:56:58.000 --> 00:56:59.000 So look, I can sense that there are more questions, so I want to invite folks to stay on for a little 00:56:59.000 --> 00:57:17.000 after-party after the top of the hour. As we wind down here today, I do want to tell you about a couple of events that I think you're going to want to check out. First up, on May ninth, journalist and editor Maria Bustillos, who's here with us in the 00:57:17.000 --> 00:57:22.000 audience, will chat with author Jessica Silbey about her latest book, Against Progress. 00:57:22.000 --> 00:57:29.000 The book examines the experiences of everyday creators and innovators navigating ownership, sharing, and sustainability within the Internet ecosystem and IP laws, 00:57:29.000 --> 00:57:36.000 so that conversation, again, will be on May ninth, and I'm sure that you won't want to miss it.
00:57:36.000 --> 00:57:50.000 You can use the link that Duncan has shared out in chat to register. And a moment here to thank Duncan, Caitlin, and Kevin, who have all been working behind the scenes on today's conversation. 00:57:50.000 --> 00:57:53.000 So that's next week, 00:57:53.000 --> 00:57:58.000 on the ninth. And then on the eleventh, 00:57:58.000 --> 00:58:04.000 those of you who either live in the Bay Area or are gonna be in the Bay Area next week, 00:58:04.000 --> 00:58:11.000 we have a great get-together planned that I think you're gonna want to come by; I believe that's on Thursday of next week. 00:58:11.000 --> 00:58:18.000 We're hosting an in-person book talk with historian Laine Nooney for their new book, The Apple II Age. Laine will discuss the history 00:58:18.000 --> 00:58:24.000 of the Apple II, and how the computer became personal, during the session, which starts at 6 PM 00:58:24.000 --> 00:58:32.000 next Thursday at 300 Funston. So as we start to close down here, and before we kick off the after-party, I do want to give some commitments and some thank-yous. 00:58:32.000 --> 00:58:39.000 We've said this a few times, but it bears repeating: the recording and the chat from today's session will be archived on archive.org 00:58:39.000 --> 00:58:49.000 later tonight, and you all will have access to those. Tomorrow we'll send an email out with the links to all the resources that we've shared here today, so you'll get that email tomorrow. 00:58:49.000 --> 00:59:01.000 As for thank-yous, a big thank you to our speakers for joining us today and for keeping the conversation lively. 00:59:01.000 --> 00:59:07.000 Thanks to Lewis for facilitating the discussion. Lewis, you get the gold 00:59:07.000 --> 00:59:08.000 star; your work in facilitating this conversation was really just spectacular. 00:59:08.000 --> 00:59:15.000 So thank you for doing that. And again, to everyone in the audience: 00:59:15.000 --> 00:59:29.000 thank you for showing up, for being responsible guests here in our conversation today, and for really adding so much to the chat. Again, we had 400 people here, 407 at peak. 00:59:29.000 --> 00:59:36.000 This was a very, very large webinar, and I think this is probably one of the liveliest chats that I've seen, 00:59:36.000 --> 00:59:40.000 really information-rich and raring to go. So I do 00:59:40.000 --> 00:59:44.000 wanna say, you know, thanks everyone for showing up here today. 00:59:44.000 --> 00:59:52.000 And really, just thank